NSF PAR Search | NSF Public Access Repository

Scaling up Continuous-Time Markov Chains Helps Resolve Underspecification

Gotovos, Alkis; Burkholz, Rebekka; Quackenbush, John; Jegelka, Stefanie (December 2021, Advances in neural information processing systems)

Modeling the time evolution of discrete sets of items (e.g., genetic mutations) is a fundamental problem in many biomedical applications. We approach this problem through the lens of continuous-time Markov chains, and show that the resulting learning task is generally underspecified in the usual setting of cross-sectional data. We explore a perhaps surprising remedy: including a number of additional independent items can help determine time order, and hence resolve underspecification. This is in sharp contrast to the common practice of limiting the analysis to a small subset of relevant items, which is followed largely due to poor scaling of existing methods. To put our theoretical insight into practice, we develop an approximate likelihood maximization method for learning continuous-time Markov chains, which can scale to hundreds of items and is orders of magnitude faster than previous methods. We demonstrate the effectiveness of our approach on synthetic and real cancer data.
Full Text Available
Scaling up Continuous-Time Markov Chains Helps Resolve Underspecification

Gotovos, Alkis; Burkholz, Rebekka; Quackenbush, John; Jegelka, Stefanie (January 2021, 35th Conference on Neural Information Processing Systems (NeurIPS 2021))

Full Text Available
A Novel Deep Learning Model by Stacking Conditional Restricted Boltzmann Machine and Deep Neural Network

https://doi.org/10.1145/3394486.3403184

Kang, Tianyu; Chen, Ping; Quackenbush, John; Ding, Wei (August 2020, Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining)
Full Text Available
Catalysis Clustering with GAN by Incorporating Domain Knowledge

https://doi.org/10.1145/3394486.3403187

Andreeva, Olga; Li, Wei; Ding, Wei; Kuijjer, Marieke; Quackenbush, John; Chen, Ping (July 2020, Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining)
Full Text Available
Clustering on Sparse Data in Non-overlapping Feature Space with Applications to Cancer Subtyping

Kang, Tianyu; Zarringhalam, Kourosh; Kuijjer, Marieke; Chen, Ping; Quackenbush, John; Ding, Wei (November 2018, 2018 IEEE International Conference on Data Mining (ICDM))

This paper presents a new algorithm, Reinforced and Informed Network-based Clustering (RINC), for finding unknown groups of similar data objects in a sparse and largely non-overlapping feature space where a network structure among features can be observed. Sparse, non-overlapping unlabeled data are becoming increasingly common and available, especially in text mining and biomedical data mining. RINC inserts a domain-informed model into a modelless neural network. In particular, our approach integrates physically meaningful feature dependencies into the neural network architecture and a soft computational constraint. Our learning algorithm efficiently clusters sparse data through integrated smoothing and sparse auto-encoder learning. The informed design requires fewer samples for training, and at least part of the model becomes explainable. The architecture of the reinforced network layers smooths sparse data over the network dependency in the feature space. Most importantly, through back-propagation, the weights of the reinforced smoothing layers are simultaneously constrained by the remaining sparse auto-encoder layers that set the target values equal to the raw inputs. Empirical results demonstrate that RINC achieves improved accuracy and renders physically meaningful clustering results.
Full Text Available
Cancer subtype identification using somatic mutation data

https://doi.org/10.1038/s41416-018-0109-7

Kuijjer, Marieke Lydia; Paulson, Joseph Nathaniel; Salzman, Peter; Ding, Wei; Quackenbush, John (May 2018, British Journal of Cancer)

Full Text Available